R is an open-source programming language used for data analysis, statistical computing, and graphics. R syntax consists of variables, comments, and keywords. It was developed in 1993 and runs on Windows, macOS, UNIX, and Linux platforms.
Random Notes:
Before calling functions from a package, you must first load the package with library(). These notes use the following packages.
library(tidyverse)
library(simplevis)
library(dplyr)
library(palmerpenguins)
library(sf)
library(leaflet)
library(plotly)
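A call to library() fails if the package is not installed. As a minimal sketch (the variable names needed and missing_pkgs are illustrative, not part of the original notes), the following checks which of the packages above are available and suggests an install command for the rest:

```r
# Packages loaded in these notes
needed <- c("tidyverse", "simplevis", "dplyr", "palmerpenguins",
            "sf", "leaflet", "plotly")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

# Only print a hint; nothing is installed automatically
if (length(missing_pkgs) > 0) {
  message(
    "Install first: install.packages(c(",
    paste0('"', missing_pkgs, '"', collapse = ", "),
    "))"
  )
}
```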
The tidyverse is a collection of packages (including dplyr, tidyr, and ggplot2) with many useful functions for data wrangling. Here are a few examples using the starwars dataset from dplyr.
starwars %>%
  filter(species == "Droid")
## # A tibble: 6 × 14
## name height mass hair_color skin_color eye_c…¹ birth…² sex gender homew…³
## <chr> <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr>
## 1 C-3PO 167 75 <NA> gold yellow 112 none mascu… Tatooi…
## 2 R2-D2 96 32 <NA> white, bl… red 33 none mascu… Naboo
## 3 R5-D4 97 32 <NA> white, red red NA none mascu… Tatooi…
## 4 IG-88 200 140 none metal red 15 none mascu… <NA>
## 5 R4-P17 96 NA none silver, r… red, b… NA none femin… <NA>
## 6 BB8 NA NA none none black NA none mascu… <NA>
## # … with 4 more variables: species <chr>, films <list>, vehicles <list>,
## # starships <list>, and abbreviated variable names ¹eye_color, ²birth_year,
## # ³homeworld
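The %>% pipe used above passes the value on its left as the first argument of the function on its right. A small sketch of that equivalence, where the names piped and nested are illustrative only:

```r
library(dplyr)

# Piped form: starwars becomes the first argument of filter()
piped <- starwars %>%
  filter(species == "Droid")

# Equivalent nested form without the pipe
nested <- filter(starwars, species == "Droid")

# Both calls should produce the same tibble
all.equal(piped, nested)
```

The piped form tends to read more naturally once several steps are chained, which is why the remaining examples use it.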
starwars %>%
  select(name, ends_with("color"))
## # A tibble: 87 × 4
## name hair_color skin_color eye_color
## <chr> <chr> <chr> <chr>
## 1 Luke Skywalker blond fair blue
## 2 C-3PO <NA> gold yellow
## 3 R2-D2 <NA> white, blue red
## 4 Darth Vader none white yellow
## 5 Leia Organa brown light brown
## 6 Owen Lars brown, grey light blue
## 7 Beru Whitesun lars brown light blue
## 8 R5-D4 <NA> white, red red
## 9 Biggs Darklighter black light brown
## 10 Obi-Wan Kenobi auburn, white fair blue-gray
## # … with 77 more rows
starwars %>%
  mutate(name, bmi = mass / ((height / 100) ^ 2)) %>%
  select(name:mass, bmi)
## # A tibble: 87 × 4
## name height mass bmi
## <chr> <int> <dbl> <dbl>
## 1 Luke Skywalker 172 77 26.0
## 2 C-3PO 167 75 26.9
## 3 R2-D2 96 32 34.7
## 4 Darth Vader 202 136 33.3
## 5 Leia Organa 150 49 21.8
## 6 Owen Lars 178 120 37.9
## 7 Beru Whitesun lars 165 75 27.5
## 8 R5-D4 97 32 34.0
## 9 Biggs Darklighter 183 84 25.1
## 10 Obi-Wan Kenobi 182 77 23.2
## # … with 77 more rows
starwars %>%
  arrange(desc(mass))
## # A tibble: 87 × 14
## name height mass hair_…¹ skin_…² eye_c…³ birth…⁴ sex gender homew…⁵
## <chr> <int> <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr>
## 1 Jabba Desi… 175 1358 <NA> green-… orange 600 herm… mascu… Nal Hu…
## 2 Grievous 216 159 none brown,… green,… NA male mascu… Kalee
## 3 IG-88 200 140 none metal red 15 none mascu… <NA>
## 4 Darth Vader 202 136 none white yellow 41.9 male mascu… Tatooi…
## 5 Tarfful 234 136 brown brown blue NA male mascu… Kashyy…
## 6 Owen Lars 178 120 brown,… light blue 52 male mascu… Tatooi…
## 7 Bossk 190 113 none green red 53 male mascu… Trando…
## 8 Chewbacca 228 112 brown unknown blue 200 male mascu… Kashyy…
## 9 Jek Tono P… 180 110 brown fair blue NA male mascu… Bestin…
## 10 Dexter Jet… 198 102 none brown yellow NA male mascu… Ojom
## # … with 77 more rows, 4 more variables: species <chr>, films <list>,
## # vehicles <list>, starships <list>, and abbreviated variable names
## # ¹hair_color, ²skin_color, ³eye_color, ⁴birth_year, ⁵homeworld
starwars %>%
  group_by(species) %>%
  summarise(
    n = n(),
    mass = mean(mass, na.rm = TRUE)
  ) %>%
  filter(
    n > 1,
    mass > 50
  )
## # A tibble: 8 × 3
## species n mass
## <chr> <int> <dbl>
## 1 Droid 6 69.8
## 2 Gungan 3 74
## 3 Human 35 82.8
## 4 Kaminoan 2 88
## 5 Mirialan 2 53.1
## 6 Twi'lek 2 55
## 7 Wookiee 2 124
## 8 Zabrak 2 80
ggplot(starwars, aes(height, mass)) +
  geom_point()
# Drop the Jabba outlier (mass 1358) so the remaining points are readable
starwars2 <- filter(starwars, name != "Jabba Desilijic Tiure")
ggplot(starwars2, aes(height, mass, colour = species)) +
  geom_point()
## Warning: Removed 28 rows containing missing values (geom_point).
ggplot(data = starwars, mapping = aes(x = height)) +
  geom_histogram(binwidth = 10)
## Warning: Removed 6 rows containing non-finite values (stat_bin).
ggplot(data = starwars, mapping = aes(x = height)) +
  geom_density()
## Warning: Removed 6 rows containing non-finite values (stat_density).
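The warnings above appear because ggplot2 drops rows with missing values in the plotted columns. A minimal sketch of inspecting and removing those rows before plotting; n_missing and starwars_complete are hypothetical names introduced here:

```r
library(dplyr)

# Count rows where height or mass is missing; a scatter plot of
# height against mass cannot draw these rows
n_missing <- starwars %>%
  filter(is.na(height) | is.na(mass)) %>%
  nrow()

# Keep only rows where both plotted columns are present
starwars_complete <- starwars %>%
  filter(!is.na(height), !is.na(mass))

n_missing
nrow(starwars_complete)
```

Alternatively, passing na.rm = TRUE to a geom such as geom_point() removes the missing rows silently instead of with a warning.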
ggplot(data = starwars, mapping = aes(x = gender, fill = hair_color)) +
  geom_bar(position = "fill") +
  labs(y = "proportion")
p1 <- plot_ly(starwars, type = "bar", x = ~species, y = ~sex)
p1
## Warning: Ignoring 4 observations
# Compare mass across three species
hde <- starwars %>%
  subset(species == "Human" | species == "Droid" | species == "Ewok")
ggplot(hde, aes(species, mass)) +
  geom_boxplot()
# Reshape height and mass into long format, then facet by measurement
hd <- starwars %>%
  subset(species == "Human" | species == "Droid")
hd <- hd %>%
  select(name, height, mass, species)
hd_g <- hd %>%
  gather(key = "measurement", value = "value", -name, -species)
ggplot(hd_g, aes(species, value)) +
  geom_boxplot() +
  facet_grid(~measurement)
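The gather() call above still works, but tidyr now marks it as superseded in favor of pivot_longer(). A sketch of the same reshaping step with pivot_longer(), assuming the same hd columns as above (hd_long is an illustrative name):

```r
library(dplyr)
library(tidyr)

# Same subset as above: humans and droids, four columns
hd <- starwars %>%
  filter(species %in% c("Human", "Droid")) %>%
  select(name, height, mass, species)

# One row per character and measurement (height or mass)
hd_long <- hd %>%
  pivot_longer(
    cols = c(height, mass),
    names_to = "measurement",
    values_to = "value"
  )

hd_long
```

The row order differs from the gather() result, but the faceted boxplot built from it should look the same.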
ggplot2 is based on the grammar of graphics, in which a plot is built from data, aesthetic mappings, and layers such as geoms and facets. Because every plot is assembled from the same pieces, plotting becomes consistent and flexible.
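To illustrate the grammar, here is a sketch that builds one plot piece by piece. The axis labels and the log scale are choices made for this example, not something taken from the plots above:

```r
library(ggplot2)
library(dplyr)  # only needed here for the starwars dataset

p <- ggplot(starwars, aes(x = height, y = mass))  # data + aesthetic mapping
p <- p + geom_point(na.rm = TRUE)                 # geometry layer
p <- p + scale_y_log10()                          # scale for the y axis
p <- p + facet_wrap(~gender)                      # one panel per gender
p <- p + labs(x = "Height (cm)", y = "Mass (kg, log scale)")
p
```

Each + adds one component, so swapping geom_point() for another geom or removing the facet line changes only that aspect of the plot.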
“Package sf represents simple features as native R objects.” In other words, sf is the “geopandas” or “arcpy” of R: it lets the user manipulate spatial objects stored in data frames. Paired with an open-source mapping package like leaflet or mapview, it can be used to create maps and visualizations of spatial data. For a quick view of your data, you can simply use the plot() function.
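As a small sketch of spatial objects living inside a data frame, the following turns an ordinary data frame with longitude and latitude columns into an sf object. The cities data frame is made up for illustration, and the coordinates are assumed to be WGS84 (EPSG:4326):

```r
library(sf)

# Plain data frame with coordinate columns (illustrative data)
cities <- data.frame(
  name = c("Delhi", "Raleigh"),
  lng  = c(77.1025, -78.6382),
  lat  = c(28.7041, 35.7796)
)

# Convert to an sf object; the coordinate columns become a geometry column
pts <- st_as_sf(cities, coords = c("lng", "lat"), crs = 4326)

class(pts)
plot(st_geometry(pts))  # quick look at the point locations
```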
Here are the methods available for sf objects. The list includes familiar dplyr and tidyr verbs as well as the st_* spatial functions.
methods(class = "sf")
## [1] $<- [
## [3] [[<- aggregate
## [5] anti_join arrange
## [7] as.data.frame cbind
## [9] coerce dbDataType
## [11] dbWriteTable distinct
## [13] dplyr_reconstruct filter
## [15] full_join gather
## [17] group_by group_split
## [19] identify initialize
## [21] inner_join left_join
## [23] merge mutate
## [25] nest pivot_longer
## [27] pivot_wider plot
## [29] print rbind
## [31] rename right_join
## [33] rowwise sample_frac
## [35] sample_n select
## [37] semi_join separate
## [39] separate_rows show
## [41] slice slotsFromS3
## [43] spread st_agr
## [45] st_agr<- st_area
## [47] st_as_s2 st_as_sf
## [49] st_as_sfc st_bbox
## [51] st_boundary st_buffer
## [53] st_cast st_centroid
## [55] st_collection_extract st_convex_hull
## [57] st_coordinates st_crop
## [59] st_crs st_crs<-
## [61] st_difference st_drop_geometry
## [63] st_filter st_geometry
## [65] st_geometry<- st_inscribed_circle
## [67] st_interpolate_aw st_intersection
## [69] st_intersects st_is
## [71] st_is_valid st_join
## [73] st_line_merge st_m_range
## [75] st_make_valid st_minimum_rotated_rectangle
## [77] st_nearest_points st_node
## [79] st_normalize st_point_on_surface
## [81] st_polygonize st_precision
## [83] st_reverse st_sample
## [85] st_segmentize st_set_precision
## [87] st_shift_longitude st_simplify
## [89] st_snap st_sym_difference
## [91] st_transform st_triangulate
## [93] st_union st_voronoi
## [95] st_wrap_dateline st_write
## [97] st_z_range st_zm
## [99] summarise transform
## [101] transmute ungroup
## [103] unite unnest
## see '?methods' for accessing help and source code
Here we read in the North Carolina example shapefile that ships with the sf package.
nc <- st_read(system.file("shape/nc.shp", package="sf"))
## Reading layer `nc' from data source
## `C:\Users\cday\AppData\Local\Programs\R\R-4.2.1\library\sf\shape\nc.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 100 features and 14 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## Geodetic CRS: NAD27
class(nc)
## [1] "sf" "data.frame"
print(nc[9:15], n = 3)
## Simple feature collection with 100 features and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## Geodetic CRS: NAD27
## First 3 features:
## BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry
## 1 1091 1 10 1364 0 19 MULTIPOLYGON (((-81.47276 3...
## 2 487 0 10 542 3 12 MULTIPOLYGON (((-81.23989 3...
## 3 3188 5 208 3616 6 260 MULTIPOLYGON (((-80.45634 3...
par(mar = c(0,0,1,0))
plot(nc[1], reset = FALSE)
par(mar = rep(0,4))
u <- st_union(nc)
plot(u)
leaflet() %>%
  addTiles() %>%
  addMarkers(lng = 77.1025, lat = 28.7041,
             popup = "Delhi, India")
leaflet(nc) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(color = "green", popup = paste0(
    "<b>NAME: </b>", nc$NAME, "<br>",
    "<b>AREA: </b>", nc$AREA, "<br>"
  ))
## Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD27 +no_defs).
## Need '+proj=longlat +datum=WGS84'
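The warning above says that leaflet expects WGS84 coordinates, while nc uses NAD27. One way to address it, sketched here with the illustrative names nc_wgs84 and m, is to reproject the layer with st_transform() before mapping:

```r
library(sf)
library(leaflet)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Reproject from NAD27 to WGS84 (EPSG:4326), the datum leaflet expects
nc_wgs84 <- st_transform(nc, 4326)

m <- leaflet(nc_wgs84) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(color = "green", popup = ~NAME)
m
```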
“Shiny is an R package that enables building interactive web applications that can execute R code on the backend. With Shiny, you can host standalone applications on a webpage, embed interactive charts in R Markdown documents, or build dashboards. You can also extend your Shiny applications with CSS themes, HTML widgets, and JavaScript actions.”
Example Code: https://github.com/gpilgrim2670/SwimMap/blob/master/app.R
Chris Example: https://christopher-day.shinyapps.io/bus_speeds_viewer/
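For orientation, a Shiny app has two parts: a UI definition and a server function. The sketch below is a minimal, self-contained example that is unrelated to the linked apps. It uses the built-in faithful dataset, and the input and output IDs "bins" and "hist" are arbitrary names chosen here:

```r
library(shiny)

# UI: a slider input and a plot output
ui <- fluidPage(
  titlePanel("Histogram sketch"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server: re-draws the histogram whenever the slider changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$waiting, breaks = input$bins,
         main = "Old Faithful waiting times", xlab = "Minutes")
  })
}

app <- shinyApp(ui = ui, server = server)

# Launch only in an interactive session so scripts do not block
if (interactive()) runApp(app)
```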
Bookdown is an open-source R package that can be used to write books, documentation, reports, articles and more with R Markdown. Some of the advantages are: